DOCUMENT RESUME

ED 467 381                                              TM 034 307

AUTHOR       Henson, Robin K.
TITLE        The Logic and Interpretation of Structure Coefficients in
             Multivariate General Linear Model Analyses.
PUB DATE     2002-04-03
NOTE         35p.; Paper presented at the Annual Meeting of the American
             Educational Research Association (New Orleans, LA, April 1-5,
             2002).
PUB TYPE     Reports - Descriptive (141) -- Speeches/Meeting Papers (150)
EDRS PRICE   EDRS Price MF01/PC02 Plus Postage.
DESCRIPTORS  Heuristics; *Multivariate Analysis
IDENTIFIERS  *General Linear Model; *Structure Coefficients

ABSTRACT 

In General Linear Model (GLM) analyses, it is important to 
interpret structure coefficients, along with standardized weights, when 
evaluating variable contribution to observed effects. Although often used in 
canonical correlation analysis, structure coefficients are less frequently 
used in multiple regression and several other multivariate analyses. This 
paper discusses and demonstrates the role of structure coefficients in 
multivariate analyses by: (1) illustrating structure coefficients in the 

univariate context with multiple regression; and (2) using canonical 
correlation analysis to demonstrate structure coefficients in the multivariate 
context. A small heuristic data set is used to make the demonstration 
concretely accessible for applied researchers. (Contains 4 tables, 3 figures, 
and 22 references.) (Author/SLD) 






Running head: STRUCTURE COEFFICIENTS






The Logic and Interpretation of Structure Coefficients in
Multivariate General Linear Model Analyses

Robin K. Henson

University of North Texas

Paper presented at the annual meeting of the American
Educational Research Association, April 3, 2002, New Orleans.
Correspondence concerning this paper should be sent to
rhenson@unt.edu.






Abstract 

In General Linear Model (GLM) analyses, it is important to 
interpret structure coefficients, alongside standardized 
weights, when evaluating variable contribution to observed 
effects. Although often used in canonical correlation analysis, 
structure coefficients are less frequently employed in multiple 
regression and several other multivariate analyses. The present 
paper discusses and demonstrates the role of structure 
coefficients in multivariate analyses by (a) illustrating 
structure coefficients in the univariate context with multiple 
regression and (b) using canonical correlation analysis to 
demonstrate structure coefficients in the multivariate context. 
A small heuristic data set is used to make the demonstration 
concretely accessible for applied researchers.

The Logic and Interpretation of Structure Coefficients in
Multivariate General Linear Model Analyses 

It is commonly known that the General Linear Model (GLM) 
serves as a general analytic system guiding all classical 
parametric analyses. Cohen (1968) demonstrated multiple 
regression as the univariate GLM. Knapp (1978) later illustrated 
that canonical correlation subsumed not only multiple regression 
but other multivariate analyses as the multivariate GLM 
umbrella. Structural equation modeling has since been shown as 
the most general case of the GLM, allowing simultaneous 
measurement and substantive modeling as part of the same 
analysis (Bagozzi, Fornell, & Larcker, 1981; Fan, 1997).

Understanding the foundational components of the GLM 
affords researchers wider utility and application of the various 
GLM analyses. Importantly, all GLM analyses have certain 
analytic characteristics in common. All analyses (a) are 
correlational in nature, (b) invoke a system of weights that are 
applied to observed variables to create synthetic (i.e., latent 
or unobserved) variables, (c) typically focus on the synthetic 
variables for analytic interest, and (d) yield $r^2$-type effect
sizes.

Determining Variable Importance

In applied research, it is often important to identify 
variables that contribute to the model being tested. For 
example, an educational psychologist may use multiple regression 
to evaluate whether self-esteem, self-concept, and self-efficacy 
are predictive of academic achievement. The researcher likely 
cares about which, if any, of the variables is able to predict 
achievement and to what degree. Identification of variable 
importance, then, is fundamental to many of the analyses we 
conduct.

However, within the GLM, all analyses yield $r^2$-type effect
sizes that must be considered prior to evaluating what variables
contributed to this effect. It makes no sense, for example, to
have a minuscule (and uninterpretable) effect size and yet try
to identify variables that contributed to that effect.
Accordingly, Thompson (1997) articulated a two-stage hierarchical
decision strategy that can be used to interpret any GLM
analysis:

All analyses are part of one general linear model. . . .
When interpreting results in the context of this model,
researchers should generally approach the analysis
hierarchically, by asking two questions:

Do I have anything? (Researchers decide this
question by looking at some combination of statistical
significance tests, effect sizes . . . and replicability
evidence.)

If I have something, where do my effects originate?
(Researchers often consult both the standardized weights
implicit in all analyses and structure coefficients to
decide this question.) (p. 31)

Once notable effects have been isolated, then (and only 
then) interpretation shifts to the identification of what 
variables in the model have contributed to that effect. 
Traditionally, the weights (often standardized) present in all 
GLM analyses are examined to judge the contribution of a 
variable to the effect observed. Using regression as an example, 
many researchers would discount the value of a variable with a 
small or near-zero β (beta) weight.

The sole interpretation of standardized weights, however, 
can lead to erroneous conclusions about variable importance. 
Burdenski (in press), Courville and Thompson (2001), and Thompson
and Borrello (1985) have documented the drawbacks of consulting
only standardized weights in multiple regression. In GLM
analyses, it is also important to interpret structure 
coefficients, alongside standardized weights, when evaluating 
variable contribution to the observed effect. 

Structure coefficients are much less understood within the
GLM than standardized weights.
Nevertheless, the reporting and interpretation of structure
coefficients are critical to the identification of variable
importance in both univariate and multivariate analyses.

Purpose 

The purpose of the present paper is to discuss and 
demonstrate the role of structure coefficients in multivariate 
analyses. Accordingly, this paper will (a) illustrate structure 
coefficients in the univariate context with multiple regression 
and (b) use canonical correlation analysis to demonstrate 
structure coefficients in the multivariate context. A small 
heuristic data set is used to make the demonstration concretely 
accessible for applied researchers. 

Multiple Regression as a Univariate Example 
Where Does an Effect Size Come From? 

Fundamental to interpreting any GLM analysis is the size of
the obtained effect, whether that effect be a variance-accounted-for
(e.g., $R^2$, $\eta^2$) or mean difference (e.g., Cohen's d)
statistic (cf. Henson & Smith, 2000; Snyder & Lawson, 1993;
Wilkinson & APA Task Force on Statistical Inference, 1999). In
regression, the effect size of interest is $R^2$, which (when
multiplied by 100) is the percentage of variance in the
dependent variable that can be explained by the predictor
variables.

In a hypothetical scenario where two predictors are
perfectly uncorrelated, the effect size is the sum of the
squared correlations between each predictor and the dependent
variable (Y):

$$R^2 = r_{YX_1}^2 + r_{YX_2}^2. \qquad (1)$$

Equation (1) makes explicit the fact that the relationships
between each predictor and Y are critical to obtaining the
overall effect.

In the real world, however, predictors are usually
correlated to some degree. In such cases the predictors may
explain the same variance in Y, and use of equation (1) would be
inappropriate because dual credit would be given to more than
one predictor. Standardized weights (β) can be derived, however,
that "split up" the shared variance among the predictors so no
two predictors are given credit for the same explained variance
in Y. The appropriate equation then becomes:

$$R^2 = \beta_1 r_{YX_1} + \beta_2 r_{YX_2}. \qquad (2)$$

In the actual analysis, of course, the β weights are applied to
the observed predictor scores (in Z score form) in a linear
equation to yield a synthetic variable consisting of predicted Y
scores that are as close as possible to the actual Y scores (for
ordinary least squares regression):

$$\hat{Y} = \beta_1 X_1 + \beta_2 X_2. \qquad (3)$$

Because we care about how close the synthetic $\hat{Y}$ scores are to
the observed Y scores, the effect size can also be stated as the
squared correlation between the predicted scores and the
dependent variable:

$$R^2 = r_{Y\hat{Y}}^2. \qquad (4)$$

Equation (4) informs us that the synthetic variable, $\hat{Y}$, is
critical in the regression, and therefore critical in result
interpretation.
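
The agreement between equations (2) and (4) can be checked numerically. The following is a minimal sketch using NumPy on simulated data; the sample size, random seed, and generating coefficients are arbitrary assumptions and are not values from this paper, and names such as `beta` and `r_yx` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed (assumption)
n = 200

# Simulated data: two correlated predictors and one dependent variable
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 0.6 * x1 + 0.3 * x2 + rng.normal(size=n)

def zscore(v):
    """Convert scores to Z score form using the sample SD (n - 1)."""
    return (v - v.mean()) / v.std(ddof=1)

Z = np.column_stack([zscore(x1), zscore(x2)])
zy = zscore(y)

# Standardized (beta) weights: least squares on Z scores (no intercept needed)
beta, *_ = np.linalg.lstsq(Z, zy, rcond=None)

# Zero-order correlations between each predictor and Y
r_yx = np.array([np.corrcoef(Z[:, j], zy)[0, 1] for j in range(2)])

# Synthetic variable of predicted scores, as in equation (3)
y_hat = Z @ beta

r2_eq2 = float(beta @ r_yx)                        # equation (2)
r2_eq4 = float(np.corrcoef(zy, y_hat)[0, 1] ** 2)  # equation (4)
r2_naive = float(np.sum(r_yx ** 2))                # equation (1), uncorrelated case only

print(f"Eq. (2): {r2_eq2:.4f}  Eq. (4): {r2_eq4:.4f}  Eq. (1) sum: {r2_naive:.4f}")
```

Equations (2) and (4) return the same value, whereas the simple sum in equation (1) generally does not match when the predictors are correlated.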

Additionally, because the effect is in a squared metric, we
can conceptualize the relationship between the predictors and Y
graphically with Venn diagrams by representing the sum of
squares of each variable. For example, in a multiple regression,
assume that $R^2 = .75$ and that the relationships between the two
predictors and Y are $r_{YX_1}^2 = .50$ and $r_{YX_2}^2 = .50$. If we were to sum
the individual predictors' squared relationships with Y, we would
get 100% explained variance, a result larger than the 75%
effect! This tells us the two predictors must be explaining
some of the same part of the Y variance. If we assume the
relationship between the two predictors is $r_{X_1X_2}^2 = .25$, then the
graphical representation of the model might look like Figure 1.



INSERT FIGURE 1 ABOUT HERE

Figure 1 demonstrates a case where both predictors explain
some of the Y variance (the $\hat{Y}$ area), but also explain some of
the same part of the Y variance (the double slashed area). It is
clear that both predictors are equally effective in predicting
the dependent variable. However, the question for the linear
equation (3) is: "What will be the magnitude of the β weights?"
Because standardized weights cannot allow dual credit to be
assigned to more than one variable for the predicted ($\hat{Y}$) area,
within any regression the β weights will be derived to either (a)
arbitrarily "split up" the shared predicted area or (b)
arbitrarily assign the entire portion to one of the variables.

However, should the shared area in Figure 1 be
disproportionately divided between the predictors, then one β may
be arbitrarily larger than the other, and therefore suggest that
one variable is more important or contributes more to the
predicted area than the other. Furthermore, if the study were to
be conducted again, the βs may reverse their magnitudes for the
two variables, a dilemma known as the "bouncing beta" problem.

What is the "Structure" of the Effect?

Because β weights cannot be examined to clearly identify the
relationships between the predictors and the dependent variable
when the predictors are correlated, more information is
needed to interpret variable importance. Further, in
the GLM, we almost always are concerned with the synthetic
variables for interpretation purposes. In multiple regression,
the predicted variable $\hat{Y}$ is our primary focus, as suggested by
equation (4). In Figure 1, the sum of squares of $\hat{Y}$ is
represented by the slashed area, which is 75% ($R^2$) of the total
sum of squares of the dependent variable (or, put alternatively,
$R^2 = SS_{explained} / SS_{total}$). The explanation of what variables
contributed to this effect is central to interpreting variable
importance.

Structure coefficients are called such because they inform
us as to the "structure" or makeup of the effect represented by
the synthetic variable $\hat{Y}$. By definition, a structure coefficient
is a simple bivariate correlation between an observed variable
(e.g., predictor) and a synthetic variable (e.g., $\hat{Y}$). Notice
that because they are bivariate correlations, structure
coefficients do not take into account the collinearity between
the predictors, and therefore shed additional light on the
importance of each predictor.

In the Figure 1 example, the squared structure coefficients
between $\hat{Y}$ and X1 and X2 would both be .67, because both
predictors can account for two-thirds of the explained effect ($\hat{Y}$
sum of squares) in and of themselves. Note that both structure
coefficients (unsquared) would be the square root of .67, or +/-
.82, which is interpreted just as a Pearson r. If we sum the two
squared structure coefficients (.67 + .67 = 1.34) the result is
larger than 1.00, due to the fact that there is shared explained
area. It should be apparent that if the predictors were
perfectly uncorrelated, then the sum of the squared structure
coefficients would be 1.00, because the predictors would account
for unique portions of the $\hat{Y}$ variance.
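
The arithmetic behind these values can be sketched in a few lines. The example below is illustrative only; it uses the Figure 1 values ($R^2 = .75$ and squared predictor-criterion correlations of .50) and relies on the relationship, discussed later in this paper, that a regression structure coefficient equals the predictor-criterion correlation divided by the multiple R. The dictionary names are hypothetical.

```python
import math

R2 = 0.75                          # effect size from the Figure 1 example
r2_yx = {"X1": 0.50, "X2": 0.50}   # squared predictor-criterion correlations

# Squared structure coefficient: share of the explained effect
# that each predictor could account for by itself
rs_squared = {name: r2 / R2 for name, r2 in r2_yx.items()}

# Unsquared structure coefficients (magnitude only; the sign follows r_YX)
rs = {name: math.sqrt(value) for name, value in rs_squared.items()}

total = sum(rs_squared.values())

for name in r2_yx:
    print(f"{name}: rs^2 = {rs_squared[name]:.2f}, |rs| = {rs[name]:.2f}")
print(f"Sum of squared structure coefficients = {total:.2f}")
```

The unrounded sum is 1.33; the 1.34 reported in the text results from summing the rounded values of .67.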

If standardized weights inform the researcher what 
variables are getting credit for the effect, then structure 
coefficients inform the researcher what variables could have 
gotten credit for the effect. Both coefficients are important, 
and both coefficients should be reported and interpreted in 
published research. In the above example, the claim that one
variable is better than the other would be unfounded (yet
definitely possible if consulting βs), and examination of
structure coefficients points to equal contributory value.

So What's the Problem with Multicollinearity? 

Multicollinearity, or the presence of correlation between
predictors, is often cited as a problem in multiple regression
and therefore to be avoided. Stevens (2002), for example,
stated:

Multicollinearity poses a real problem for the researcher
using multiple regression for three reasons:

1. It severely limits the size of R, because the
predictors are going after much of the same
variance on y. . . .

2. Multicollinearity makes determining the importance
of a given predictor difficult because the
effects of the predictors are confounded due to
the correlations among them.

3. Multicollinearity increases the variances of the
regression coefficients. (pp. 91-92)

These concerns are merited if considering only β weights,
but they largely become moot when interpreting structure
coefficients. Regarding (1), the predictors may explain the same
part of Y variance, but R is not artificially limited. Of
course, R will not get bigger unless additional portions of
dependent variable variance are explained, but within the GLM,
the addition of predictor variables will only result in $R^2$ either
remaining the same (no additional variance explained) or getting
larger. Regarding (2), structure coefficients clarify variable
importance as noted above. Regarding (3), decisions based solely
on β weights may be impacted by inflated standard errors if using
statistical significance tests. However, structure coefficients
are not impacted by inflated standard errors as they are
descriptive correlational measures. Of course, β standard errors
are only relevant when the researcher depends on statistical
significance testing, which may obfuscate variable contributions
due to the impact of sample size on null hypothesis tests. The
bottom line is that multicollinearity is not a problem in
multiple regression, and therefore not in any other GLM
analysis, if the researcher invokes structure coefficients in
addition to standardized weights. In fact, in some multivariate
analyses, multicollinearity is actually encouraged, say, for
example, when multi-operationalizing a dependent variable with
several similar measures.
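
The point regarding (1), that adding a predictor can leave $R^2$ unchanged or increase it but never reduce it, can be illustrated with a short simulation. This is a minimal sketch under assumed conditions: the data are simulated, the seed and coefficients are arbitrary, and the helper `r_squared` is a hypothetical name introduced only for this example.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed (assumption)
n = 100

# Simulated data: x2 is highly collinear with x1
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.3 * rng.normal(size=n)
y = x1 + rng.normal(size=n)

def r_squared(predictors, y):
    """R^2 from an ordinary least squares fit with an intercept."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

r2_one = r_squared([x1], y)       # model with x1 only
r2_two = r_squared([x1, x2], y)   # model with the collinear x2 added

print(f"R^2 with x1 only:  {r2_one:.4f}")
print(f"R^2 with x1 and x2: {r2_two:.4f}")
```

Even with a nearly redundant second predictor, the fitted $R^2$ for the larger model is at least as large as that for the smaller model.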

Canonical Correlation as a Multivariate Example 

Canonical correlation analysis (CCA) is a natural 
multivariate extension of multiple regression with which 
researchers can examine the relationship between several 
predictors and several dependent variables simultaneously 
(Henson, 2000; Thompson, 1984, 1991). In CCA, the several 
predictors are linearly combined into one synthetic predictor 
variable. This process is directly analogous to the creation of
$\hat{Y}$ in multiple regression. However, in CCA, the dependent
variables are also linearly combined to create one synthetic
criterion variable (also analogous to $\hat{Y}$). The canonical
correlation itself is nothing more than a Pearson r correlation
between the synthetic predictor and synthetic criterion
variables. In CCA, this pair of synthetic variables is created
for each canonical function (variate). The first function
maximizes shared variance between the observed predictor and
dependent variables. Subsequent functions are created (analogous
to factors in factor analysis) that maximize explained variance
for the residual (i.e., unexplained) variance left over from the
previous function.

In CCA, then, there are two synthetic variables for each 
function. Because structure coefficients are the correlation 
between an observed and a synthetic variable, there are structure 
coefficients that explain relationships between both the 
predictor and criterion synthetic variables and their respective 
observed variable sets. 

Table 1 presents a heuristic data set that will be used to 
help make the present discussion concrete. In this hypothetical 
example, the researcher is investigating whether two variables 
related to adult attachment, secure and preoccupied attachment 
styles, are predictive of variation in personality styles as 
measured by the "Big Five" factors: neuroticism, extraversion,
openness, agreeableness, and conscientiousness. Data are 
presented as T scores for 10 people. For SECURE and PREOCC, Z 
scores are parenthetically presented for later use. 
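
The parenthetical Z scores can be reproduced from the T scores. The following minimal sketch uses the SECURE and PREOCC values as transcribed in Table 1 and assumes the sample standard deviation (n - 1 in the denominator), which is the convention that matches the tabled values; the function name `zscore` is illustrative.

```python
import numpy as np

# T scores for the 10 cases, as transcribed from Table 1
secure = np.array([35, 36, 45, 45, 48, 50, 52, 53, 59, 59], dtype=float)
preocc = np.array([52, 55, 51, 49, 54, 39, 44, 39, 49, 45], dtype=float)

def zscore(v):
    """Z scores using the sample standard deviation (ddof=1)."""
    return (v - v.mean()) / v.std(ddof=1)

z_secure = zscore(secure)
z_preocc = zscore(preocc)

for case, (zs, zp) in enumerate(zip(z_secure, z_preocc), start=1):
    print(f"Case {case:2d}: SECURE z = {zs:6.3f}, PREOCC z = {zp:6.3f}")
```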



INSERT TABLE 1 ABOUT HERE 

The CCA for SECURE and PREOCC predicting NEURO, EXTRA, OPEN, 
AGREE, and CONSC yielded a squared canonical correlation of .682 
for the first function and .163 for the second function. (Note: 
There will be as many functions as there are variables in the 
smaller variable set, which in this case is two.) The Appendix 
presents the SPSS syntax used for the CCA. Supposing we deem only
the first function's effect noteworthy, we now concern ourselves
with defining this function by identifying what variables 
contributed to the effect. 

Table 2 presents the standardized canonical function 
coefficients (directly analogous to β weights) and the structure
coefficients for the first function. Examination of only the 
function coefficients might lead one to conclude the first 
function was largely the result of PREOCC predicting NEURO and 
AGREE. The prediction is positive due to the fact that all the 
structure coefficients have the same sign. Note as well that we 
cannot tell directionality by simply consulting the standardized 
weights, because when the observed variables are correlated, the 
standardized weights are not direct measures of relationship. In 
the present example, this is made explicit by the existence of a
function coefficient greater than one in absolute value for NEURO.



INSERT TABLE 2 ABOUT HERE 

However, examination of the structure coefficients ($r_s$) and
their squared values provides a more complete picture of the
variable relationships. PREOCC is still clearly the best
predictor, but it would be inappropriate to label SECURE as
useless as it can account for over half of the synthetic
predictor variable by itself, a fact obfuscated by the small
standardized weight for SECURE. Additionally, if only consulting
the function coefficients, it appears clear that NEURO and AGREE
are the predominant criterion variables with the rest of the
variables having low or near zero coefficients. Although NEURO
is the dominant criterion variable, the squared structure
coefficients inform us that EXTRA in fact can account for about
one-third of the synthetic criterion variable while AGREE can
account for less than 3%, in spite of having the second largest
standardized weight!

Clearly, consulting only standardized weights can
mask important relationships when variables are correlated.
Consulting structure coefficients, in turn, clarifies
variable relationships in the presence of multicollinearity,
which is almost always present in applied research.

Construction of the Synthetic/Latent Variables 

Table 3 presents the calculations used to create the
synthetic predictor variable for all 10 cases in Table 1, using
the standardized function coefficients from Table 2. These
calculations make explicit how the standardized weights are
applied to the observed scores (in Z score form) to create the
synthetic predictor variable, which as noted is directly
analogous to $\hat{Y}$ in multiple regression. Remember that in the CCA,
a similar equation is used that combines the dependent variables
into one synthetic dependent variable. The present discussion,
however, will focus only on the predictor side of the equation.



INSERT TABLE 3 ABOUT HERE

Canonical Structure Coefficients

The structure coefficients for SECURE and PREOCC can now be 
directly calculated from the Table 3 results as the correlations 
between the observed SECURE and PREOCC scores and the synthetic 
predictor variable created from those scores. These correlations 
are .752 and -.973 for SECURE and PREOCC, respectively. Note that 
these structure coefficients match exactly those in Table 2 
created by the CCA. 
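
This calculation can be sketched directly. The example below is illustrative rather than the SPSS procedure used for the analysis; it takes the Z scores from Table 1 and the function coefficients from Table 2 as transcribed, builds the synthetic predictor as in Table 3, and correlates it with each observed predictor using NumPy. Variable names are hypothetical.

```python
import numpy as np

# Z scores for the 10 cases (Table 1) and function coefficients (Table 2)
z_secure = np.array([-1.593, -1.473, -0.386, -0.386, -0.024,
                      0.217,  0.459,  0.579,  1.304,  1.304])
z_preocc = np.array([ 0.747,  1.268,  0.573,  0.226,  1.095,
                     -1.512, -0.643, -1.512,  0.226, -0.469])
w_secure, w_preocc = 0.284, -0.809

# Synthetic predictor variable: weighted sum of the Z scores (Table 3)
synthetic = w_secure * z_secure + w_preocc * z_preocc

# Structure coefficients: Pearson r between each observed predictor
# and the synthetic predictor variable
rs_secure = np.corrcoef(z_secure, synthetic)[0, 1]
rs_preocc = np.corrcoef(z_preocc, synthetic)[0, 1]

print(f"SECURE rs = {rs_secure:.3f}, squared = {rs_secure ** 2:.3f}")
print(f"PREOCC rs = {rs_preocc:.3f}, squared = {rs_preocc ** 2:.3f}")
```

Because the inputs are rounded to three decimals, the results agree with the tabled coefficients to about that precision.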

Figures 2 and 3 graphically display the relationships 
between SECURE and PREOCC with the synthetic variable. In this 
scenario, both variables have strong relationships with the 
synthetic variable, and therefore should be considered in 
interpretation. Of course, consultation of only the function 
coefficients would not have led to the same conclusion. 



INSERT FIGURES 2 and 3 ABOUT HERE

Structure Coefficients in Other Multivariate Analyses

Structure coefficients are present throughout the GLM, and 
typically are necessary for result interpretation. However, the 
literature is inconsistent in defining the role of structure 
coefficients in various GLM analyses. Burdenski (in press) and 
Courville and Thompson (2001) have documented that authors 
seldom report structure coefficients in multiple regression and 
almost exclusively only consult weights when determining 
variable importance.

In CCA, there is some consensus that structure coefficients
are necessary (cf. Meredith, 1964; Thompson, 1984). As Levine
(1977) argued:

I specifically say that one has to do this [interpret
structure coefficients] since I firmly believe as long as
one wants information about the nature of the canonical
correlation relationship, not merely the computation of the
[synthetic variable] scores, one must have the structure
matrix. (p. 20)

Cohen and Cohen (1983) also noted that "interpretation of a 
given canonical variate is best undertaken by means of the 
structure coefficients" (p. 456). 

Because CCA is the multivariate GLM, and because structure 
coefficients are critical for CCA interpretation, it stands to 
reason that interpretation of other GLM analyses would also 
require structure coefficients. As Huberty (1994) explained, 

if a researcher is convinced that the use of structure rs 
makes sense in, say, a canonical correlation context, he or 
she would also advocate the use of structure rs in the 
contexts of multiple correlation, common factor analysis, 
and descriptive discriminant analysis. (p. 263)

However, as in multiple regression, structure coefficients
are often ignored in other multivariate analyses. In factor
analysis, for example, factors are uncorrelated when an
orthogonal rotation is used (e.g., varimax). In such cases, the
factor pattern matrix is the same as the factor structure matrix.
The factor structure matrix is found by multiplying the factor
pattern matrix ($P_{V \times F}$) with the factor correlation matrix ($R_{F \times F}$).
When the factors are uncorrelated, $R_{F \times F}$ is an identity matrix.
Therefore, the structure matrix ($S_{V \times F}$) will be the same as the
pattern matrix, such that:

$$P_{V \times F} R_{F \times F} = S_{V \times F}. \qquad (5)$$

This outcome is analogous to a regression with perfectly
uncorrelated predictors, and interpretation of a separate
structure matrix is unnecessary. However, when the factors are
correlated via an oblique rotation, the pattern and structure
matrices will not be identical, and both matrices should be
reported and interpreted, just as one would interpret β weights
and structure coefficients in regression in the presence of
correlated predictors. Unfortunately, empirical reviews of
exploratory factor analyses indicate that structure matrices are
often ignored (Henson, Capraro, & Capraro, 2001; Henson &
Roberts, in press).
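
The relationship in equation (5) can be sketched with a small numerical example. The pattern matrix and the factor correlation of .40 below are hypothetical values chosen only for illustration; they are not results from any analysis in this paper.

```python
import numpy as np

# Hypothetical pattern matrix: 4 observed variables by 2 factors
P = np.array([[0.80, 0.10],
              [0.70, 0.00],
              [0.10, 0.75],
              [0.00, 0.85]])

# Factor correlation matrices: orthogonal (identity) versus oblique
R_orthogonal = np.eye(2)
R_oblique = np.array([[1.0, 0.4],
                      [0.4, 1.0]])

# Structure matrix = pattern matrix times factor correlation matrix
S_orthogonal = P @ R_orthogonal
S_oblique = P @ R_oblique

print("Orthogonal: structure equals pattern ->", np.allclose(S_orthogonal, P))
print("Oblique structure matrix:")
print(np.round(S_oblique, 3))
```

With correlated factors, each structure coefficient also reflects the variable's relationship with the other factor, so the two matrices diverge.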

Nevertheless, structure coefficients are present throughout 
the GLM and should be consulted when considering variable 
importance. Table 4 lists several multivariate analyses, along 
with common names for the standardized weights used in the 
analyses. Also given is a description of what structure 
coefficients are correlating in a given analysis.

INSERT TABLE 4 ABOUT HERE

Discussion 

When observed predictors/variables are correlated and 
combined into a synthetic variable, the definition of observed 
effects (e.g., $R^2$) must invoke examination of both standardized
weights and structure coefficients. This is true in multiple
regression, canonical correlation analysis, and throughout the
GLM.

One reasonable alternative to examining structure 
coefficients in multiple regression is to consult the 
correlations between the predictors and the dependent variable 
directly. Indeed, another way to derive structure coefficients is
to divide the correlation between the predictor and dependent
variable by the multiple R:

$$r_s = r_{XY} / R. \qquad (6)$$

This equation informs us that all regression structure
coefficients are the zero-order correlations between predictors
and the dependent variable divided by a constant (R). Therefore,
these zero-order correlations contain the same information as the
structure coefficients. However, the information is in a
different metric, with structure coefficients representing
relationship with the synthetic effect ($\hat{Y}$), which is of primary
interest. So, the decision about whether to interpret the
zero-order correlations with the dependent variable or structure
coefficients depends on the researcher's desire to describe his
or her results in terms of the effect obtained or the observed
variables.
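
The equivalence between the two routes to a regression structure coefficient can be verified numerically. The sketch below uses simulated data with three predictors; the seed, sample size, and generating coefficients are arbitrary assumptions, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed (assumption)
n = 150

# Simulated predictors (the first two are correlated) and criterion
X = rng.normal(size=(n, 3))
X[:, 1] += 0.6 * X[:, 0]
y = X @ np.array([0.5, 0.2, -0.4]) + rng.normal(size=n)

# Ordinary least squares with an intercept; y_hat is the synthetic variable
design = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
y_hat = design @ coef

# Multiple R: correlation between observed and predicted scores
R = np.corrcoef(y, y_hat)[0, 1]

# Route 1: correlate each predictor with the synthetic variable
rs_direct = np.array([np.corrcoef(X[:, j], y_hat)[0, 1] for j in range(3)])

# Route 2: zero-order correlation with Y divided by the multiple R
r_xy = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(3)])
rs_from_r = r_xy / R

print("Direct:   ", np.round(rs_direct, 4))
print("r_XY / R: ", np.round(rs_from_r, 4))
```

Both routes return the same coefficients, which is why the zero-order correlations and the structure coefficients carry the same information in different metrics.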

Pedhazur (1997) also argued that structure coefficients can 
be excessively large even when the effect size is not noteworthy. 
For example, a tiny $R^2 = .01$ effect might be found, and if this
effect is almost entirely due to one of the predictors, the
structure coefficient for that predictor might be, say, .90.
However, within the hierarchical strategy for interpreting all GLM
analyses, one would never interpret the origins of an effect
without first declaring the effect to be worth interpreting. In
this context, the concern about misinterpretation seems
unwarranted.

Furthermore, within multivariate analyses, there is not a 
single dependent variable with which to correlate the observed 
predictors. In this context, structure coefficients are essential 
for interpretation. 

The present paper has demonstrated the role of both 
standardized weights and structure coefficients in univariate and 
multivariate analyses. Both coefficients are important for 
determining variable importance. Unfortunately, structure 
coefficients are often ignored, and overdependence is placed on 
standardized weights, perhaps resulting in misinterpretation of 
substantive findings. The present paper may serve to inform 
applied researchers about (a) the presence of structure 
coefficients, (b) the conceptual underpinnings of what a 
structure coefficient is, and (c) how to interpret these
coefficients to determine variable importance in the presence of
multicollinearity.

Table 1

Heuristic Data for Canonical Correlation Example.

            Predictor Var.                 Criterion Var.
Case No.    SECURE         PREOCC          NEURO  EXTRA  OPEN  AGREE  CONSC
   1        35 (-1.593)    52 (  .747)      51     46     57    51     50
   2        36 (-1.473)    55 ( 1.268)      53     51     45    52     45
   3        45 ( -.386)    51 (  .573)      48     35     56    55     48
   4        45 ( -.386)    49 (  .226)      54     42     51    38     63
   5        48 ( -.024)    54 ( 1.095)      50     43     38    63     37
   6        50 (  .217)    39 (-1.512)      46     50     39    49     39
   7        52 (  .459)    44 ( -.643)      52     48     57    46     43
   8        53 (  .579)    39 (-1.512)      39     59     50    53     58
   9        59 ( 1.304)    49 (  .226)      48     62     43    34     55
  10        59 ( 1.304)    45 ( -.469)      45     50     47    59     53

Note. Values in parentheses for SECURE and PREOCC are Z scores.




Table 2 

Standardized 

Coefficients 


Canonical Function 
for Function One. 


Coefficients and 


Structure 


Variable 


Funct. Coef. 


r s 


r 2 

L S 


SECURE 


.284 


.752 


56.55% 


PREOCC 


-.809 


-.973 


94 . 67% 


NEURO 


- 1.217 


-.860 


73.96% 


EXTRA 


-.163 


.567 


32.15% 


OPEN 


. 064 


-.120 


1.44% 


AGREE 


-.745 


-.165 


2.72% 


CONSC 


-.367 


.190 


3.61% 



Note . r s = structure coefficient. r s 2 = squared structure 
coefficient times 100. The largest two function and structure 
coefficients for the criterion variables are in bold. 

Table 3

Calculation of the Synthetic Predictor Variable.

SECURE (WEIGHT)   +   PREOCC (WEIGHT)   =   Synthetic Predictor

-1.593 (.284)     +     .747 (-.809)    =        -1.057
-1.473 (.284)     +    1.268 (-.809)    =        -1.444
 -.386 (.284)     +     .573 (-.809)    =         -.573
 -.386 (.284)     +     .226 (-.809)    =         -.292
 -.024 (.284)     +    1.095 (-.809)    =         -.893
  .217 (.284)     +   -1.512 (-.809)    =         1.285
  .459 (.284)     +    -.643 (-.809)    =          .651
  .579 (.284)     +   -1.512 (-.809)    =         1.388
 1.304 (.284)     +     .226 (-.809)    =          .188
 1.304 (.284)     +    -.469 (-.809)    =          .750
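
The arithmetic in Table 3 is a weighted sum for each case: the SECURE Z score times its standardized function coefficient (.284) plus the PREOCC Z score times its coefficient (-.809). The sketch below, offered only as an illustration, repeats that calculation in Python; the variable names are not from the original analysis.

```python
# Z scores for SECURE and PREOCC, as listed in Tables 1 and 3.
z_secure = [-1.593, -1.473, -0.386, -0.386, -0.024,
            0.217, 0.459, 0.579, 1.304, 1.304]
z_preocc = [0.747, 1.268, 0.573, 0.226, 1.095,
            -1.512, -0.643, -1.512, 0.226, -0.469]

# Standardized canonical function coefficients from Table 2.
W_SECURE = 0.284
W_PREOCC = -0.809

# Synthetic predictor score = weighted sum of the two Z scores.
synthetic = [W_SECURE * zs + W_PREOCC * zp
             for zs, zp in zip(z_secure, z_preocc)]

for case, score in enumerate(synthetic, start=1):
    print(f"{case:2d}  {score:7.3f}")
```

Small differences in the third decimal place relative to Table 3 can arise because the Z scores and weights entered here are already rounded.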








Figure 1. Venn diagram of multiple regression with two predictors
for R^2 = .75 with multicollinearity.

Figure 2. Scatterplot between the observed SECURE predictor and
the synthetic predictor variable.

Figure 3. Scatterplot between the observed PREOCC predictor and
the synthetic predictor variable.
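
The relationships plotted in Figures 2 and 3 can be summarized numerically: a structure coefficient is the bivariate correlation between an observed variable and its synthetic variable. The following sketch is an illustration only. It uses the rounded Z scores and weights from Tables 1 through 3 and a hand-written Pearson helper (the name pearson_r is hypothetical), so results may differ slightly in the third decimal from the values in Table 2.

```python
from math import sqrt

# Z scores for the two predictors (Table 1) and weights (Table 2).
z_secure = [-1.593, -1.473, -0.386, -0.386, -0.024,
            0.217, 0.459, 0.579, 1.304, 1.304]
z_preocc = [0.747, 1.268, 0.573, 0.226, 1.095,
            -1.512, -0.643, -1.512, 0.226, -0.469]
W_SECURE, W_PREOCC = 0.284, -0.809

# Synthetic predictor scores, computed as in Table 3.
synthetic = [W_SECURE * zs + W_PREOCC * zp
             for zs, zp in zip(z_secure, z_preocc)]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Structure coefficients: observed predictor vs. synthetic predictor.
r_secure = pearson_r(z_secure, synthetic)
r_preocc = pearson_r(z_preocc, synthetic)

print(f"r_s SECURE = {r_secure:.3f}")
print(f"r_s PREOCC = {r_preocc:.3f}")
```

The two correlations should land close to the structure coefficients of .752 and -.973 reported for SECURE and PREOCC in Table 2.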

Appendix

title 'structure coefficients demo'.
comment Run canonical correlation.

MANOVA
  neuro extra open agree consc WITH secure preocc
  /PRINT=SIGNIF(MULTIV EIGEN DIMENR)
  /DISCRIM=STAN ESTIM COR ALPHA(.999).